Introduction


Genome-wide association studies (GWAS) of breast cancer have been completed
among populations of European ancestry, and several regions have been identified that
appear to contribute to susceptibility to this cancer. Recent data suggest, however, that
not all risk alleles for common cancers will be revealed by studies limited to Whites of
European ancestry, and that similar efforts in other racial and ethnic populations will be
needed to identify the full spectrum of common risk alleles that contribute to disease risk
in the population. To identify genetic risk alleles for breast cancer among African
American women, we have performed a well-powered genome-wide association scan.
For this project we have established a collaborative network of investigators whose
careers have been dedicated to studying breast cancer in minority populations and who
have contributed samples and covariates from each of their respective studies to identify
genetic variants that contribute to risk of breast cancer in this population. We
have completed a GWAS of >1.1 million SNPs in >3,000 African American breast cancer
cases and >2,700 controls. With these data we have validated and improved upon markers of
risk at the known breast cancer risk regions that better characterize their contribution to
breast cancer risk in women of African ancestry. In collaboration with GWAS in
populations of European ancestry we have also revealed a novel risk locus for estrogen
receptor negative breast cancer.

BODY


The Specific Aim of this application is to identify genetic risk alleles for breast cancer among
African American women by performing a well-powered genome-wide association study
(GWAS). For this project, I have established a network of leaders in the breast cancer research
community with long-standing interests in breast cancer research in African Americans, all of
whom have existing case-control studies of breast cancer in the U.S. Funding for the genotyping
of samples from the MEC, CARE, WCHS, SFBC and BCFR studies is covered by this DOD-BCRP
grant. Genotyping was conducted using the Illumina Infinium 1M array. The genotyping of the
other studies has been supported by a number of other sources. Stage 1 of the GWAS included 9
epidemiological studies of invasive breast cancer among African American women, which
comprise a total of 3,153 cases and 2,831 controls. Details of the participating studies,
genotyping and statistical analysis of the GWAS data have been provided in previous progress
reports. Here we present results for two specific analyses: 1) fine-mapping of the known breast
cancer risk loci, and 2) a meta-analysis of GWAS for estrogen receptor negative breast cancer.

Fine-Mapping of Breast Cancer Susceptibility Loci Characterizes Genetic Risk in African 
Americans 

We tested common genetic variation at the breast cancer risk loci identified in women of 
European and Asian descent in the stage 1 African American breast cancer sample to identify 
markers of risk that are relevant to this population. More specifically, we examined the index
variants and conducted fine-mapping of each locus, both to improve the current set of risk
markers in African Americans and to identify new risk variants for breast cancer. We then
applied this information to model breast cancer risk in African American women in an attempt
to characterize the spectrum of genetic risk in this population defined by common variants at
the known risk loci.

We tested the 19 validated breast cancer risk variants (referred to as “index variants”) at
1p11, 2q35, 3p24, 5p12, 5q11, 6q25, 8q24, 9p21, 9q31, 10p15, 10q21, 10q22, 10q26, 11p15,
11q13, 14q24, 16q12, 17q23 and 19p13 in models adjusted for age, study, global ancestry (the
first 10 eigenvectors) and local ancestry [1-6]; 17 SNPs were directly genotyped, while 2 were
imputed using MACH (r²>0.98). All 19 variants were common (frequency >0.05) in African
Americans, with 11 variants being more common in Europeans than African Americans (Figure 1).
In previous GWAS, the index signals had very modest odds ratios (1.05-1.29 per copy of the risk
allele), and our sample size provided >70% statistical power to detect the reported effects for
12 of the 19 variants (at P<0.05). We observed positive associations with 11 of the 19 variants
(OR>1); however, only 4 were statistically significant (P<0.05 at 2q35, 9q31, 10q26 and 19p13).
Of the 15 variants that were not replicated at P<0.05, statistical power was <70% for only 7 of
the variants. Although power was more limited, we also evaluated associations by estrogen
receptor (ER) status, as some risk variants have been found to be more strongly associated with
ER-positive (ER+) or ER-negative (ER-) breast cancer. We observed positive associations with
12 variants (2 at P<0.05) for ER+ disease (n=1,520) and with 9 variants for ER- disease (3 at
P<0.05; n=988). For only one variant did we observe statistically significant risk heterogeneity
by ER status (rs13387042 at 2q35, P=0.013).

Aside from limited statistical power, the lack of a statistically significant association with
an index variant (OR>1 and P<0.05) suggests that the particular variant revealed in the GWAS
populations may not be adequately correlated with the biologically relevant allele in African
Americans. In an attempt to identify a better genetic marker of risk in African Americans we
conducted fine-mapping across all risk regions using genotyped SNPs on the Illumina 1M array
and SNPs imputed to Phase 2 HapMap populations. If a marker associated with risk in African
Americans represents the same signal as that reported in the initial GWAS, then it should be
correlated to some degree with the index signal in the GWAS population. Using HapMap data
for the populations in which the risk variant was identified (Utah residents with ancestry from
northern and western Europe (CEU), or Han Chinese in Beijing, China (CHB)), we catalogued
and tested all SNPs that were correlated (r²>0.2) with the index signal (within 250 kb), applying
an α_a of 3.2×10⁻³, which was estimated as 0.05 divided by the average number of tags needed to
capture (r²>0.8) the common risk alleles correlated with the index allele in each region in the
Yoruba HapMap population (in Ibadan, Nigeria (YRI)). We also tested for novel independent
associations, focusing on SNPs that were uncorrelated with the index signal in the initial GWAS
populations. Here, we applied a Bonferroni correction for defining novel associations as
statistically significant in each region, with α_b estimated as 0.05 divided by the total number
of tags needed to capture (r²>0.8) all common risk alleles in the 19 regions in the YRI
population (α_b=1.0×10⁻⁵; similar to the genome-wide-type correction of 5×10⁻⁸, which accounts
for the number of tags needed to capture all common alleles in the genome). For each region,
stepwise logistic regression was used, with SNPs kept in the final model based on α_a or α_b.
These procedures were applied to all cases and controls as well as in hypothesis-generating
analyses stratified by ER status.

At 9 loci we detected variants that were statistically significantly associated with
breast cancer risk in African Americans. These regions include 9q31, where the sole marker of
risk was the index signal (rs865686: OR=1.08; P=0.034). Through fine-mapping we revealed
markers in four regions that were more significantly associated with risk than the index signal
(>1 order of magnitude change in the P-value) and are likely capturing the same signal (2q35,
5q11, 10q26 and 19p13). We also identified markers in four regions that are not correlated with
the index signal in the GWAS populations (8q24, 10q22, 11q13 and 16q12) and may represent
putative novel risk variants, with one being specific for ER+ disease (8q24). These regions are
discussed below.
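The two significance thresholds described above follow the usual Bonferroni form, 0.05 divided by a number of tag SNPs. The sketch below illustrates that arithmetic and the rule for choosing between the two thresholds. The tag counts are hypothetical values chosen only so that the results land near the reported thresholds; the report does not state the actual counts, and the function names are illustrative.

```python
def bonferroni_alpha(n_tags: float, family_alpha: float = 0.05) -> float:
    """Per-test threshold: family-wise alpha divided by the number of tags."""
    if n_tags <= 0:
        raise ValueError("n_tags must be positive")
    return family_alpha / n_tags


def threshold_for_snp(r2_with_index: float, alpha_a: float, alpha_b: float,
                      r2_cutoff: float = 0.2) -> float:
    """Use alpha_a for SNPs correlated with the index signal, alpha_b otherwise."""
    return alpha_a if r2_with_index > r2_cutoff else alpha_b


# Hypothetical tag counts (assumptions, not taken from the report).
alpha_a = bonferroni_alpha(15.6)   # average tags per region -> about 3.2e-3
alpha_b = bonferroni_alpha(5000)   # total tags across regions -> about 1.0e-5
```

Under this reading, a SNP with r²=0.4 to the index variant in the GWAS population would be judged against α_a, while an uncorrelated SNP would need to pass the stricter α_b.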

Risk variants that better define the index signal in African Americans 

2q35 

The index signal at 2q35 was statistically significantly associated with risk of overall breast
cancer (rs13387042: OR=1.12, P=7.5×10⁻³) and ER+ disease (OR=1.22, P=2.6×10⁻⁴). However,
we found stronger associations with two markers that are each modestly correlated with the
index signal in CEU and YRI: rs13000023 with overall breast cancer (OR=1.20, P=5.8×10⁻⁴) and
rs12998806 with ER+ disease (OR=1.39, P=3.3×10⁻⁶). The signal in this region appeared
limited to ER+ breast cancer, which is consistent with the initial report of this risk locus [3].

5q11

We found a positive but non-significant association with the index signal at 5q11, which is
located 79 kb centromeric of the MAP3K1 gene (rs889312: OR=1.07, P=0.084). Fine-mapping revealed
statistically significant associations with two markers: rs16886165 for overall breast cancer
(OR=1.15, P=6.5×10⁻⁴) and rs832529 for ER- disease (OR=1.22, P=1.3×10⁻³). These SNPs
show greater correlation with the index signal in Europeans (CEU, r²=0.40 and 0.46) than in
Africans (YRI, r²<0.01 and r²=0.09), which suggests that they may be better markers of the
biologically functional variant in African Americans.
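For readers less familiar with the correlation measure used throughout this section, r² between two biallelic SNPs is conventionally computed from haplotype and allele frequencies. The following is a minimal sketch of that standard definition; the input frequencies are hypothetical, and the HapMap values quoted in this report come from real reference data rather than from this code.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical examples: alleles that always travel together, and independent alleles.
perfect = ld_r2(p_ab=0.3, p_a=0.3, p_b=0.3)
independent = ld_r2(p_ab=0.06, p_a=0.2, p_b=0.3)
```

Because haplotype frequencies differ between populations, the same pair of SNPs can be well correlated in CEU and nearly uncorrelated in YRI, which is the pattern exploited by fine-mapping in African Americans.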

10q26 

Both the index signal, rs2981582 (OR=1.11, P=8.6×10⁻³), and rs2981578, which was identified
previously through fine-mapping in African Americans (to which some of these studies
contributed) [7], were statistically significantly associated with risk (rs2981578: OR=1.24,
P=1.7×10⁻⁴). Variant rs2981578 was the most strongly associated marker in the region for
overall breast cancer and for ER+ disease, which is consistent with previous reports of
variation in this region being more strongly associated with ER+ breast cancer [8]. In
fine-mapping the locus we observed a suggestive association between a correlated marker and
ER- disease (rs2912774: OR=1.19, P=2.1×10⁻³); however, the association was also noted with ER+
disease (OR=1.10, P=0.041), and the marker is likely capturing the same signal as rs2981578.

19p13

19p13 was the first risk locus reported to harbor a variant that may be specific for ER-
disease [9]. In African Americans, the index variant was statistically significantly associated
with risk of overall breast cancer (rs2363956: OR=1.14, P=8.0×10⁻⁴), as well as ER+ (OR=1.12,
P=0.016) and ER- disease (OR=1.14, P=0.01). The most significant association in the region for
overall breast cancer and ER+ disease was with rs3745185 (P=3.7×10⁻⁵ and P=8.2×10⁻⁴,
respectively), which is likely to be capturing the same functional variant (r²=0.57 in CEU and
0.19 in YRI). The most significant marker for ER- breast cancer was correlated with both
rs2363956 and rs3745185 (rs11668840: OR=1.25, P=5.1×10⁻⁵).

Novel risk-associated markers at breast cancer susceptibility loci. 

8q24 

Given the importance of the 8q24 locus in cancer, we conducted association testing across the
entire cancer risk region (126.0 Mb-130.0 Mb) [10,11]. The index signal (rs13281615) was not
statistically significantly associated with risk in African Americans, nor did we identify
significant associations with correlated SNPs. However, we did detect a significant association
between rs16902056 and ER+ breast cancer (risk allele frequency, 0.95; P=6.7×10⁻⁶; ER-: P=0.66).
This SNP is located 78 kb centromeric of the index variant and is not correlated with the index
variant (r²<0.01 in CEU and r²=0.027 in YRI). No statistically significant associations were
observed with variants found previously in association with cancers of the bladder and ovary,
or leukemia (rs9642880: OR=1.03, P=0.58; rs10088218: OR=1.02, P=0.62; rs2456449: OR=1.07,
P=0.14). Of the known risk variants for prostate cancer we found a single nominally significant
(P<0.05) association with the same risk allele of rs1016343 (P=0.015), which is located >260 kb
centromeric of the breast cancer risk region and is not correlated with rs13281615 or
rs16902056.

10q22 

We observed no association with the index signal at 10q22 (rs704010), which is located in
intron 1 of the gene ZMIZ1, or with any correlated markers. However, we did detect strong
evidence of a second signal located 215 kb telomeric in intron 12 of ZMIZ1 (rs12355688:
OR=1.24, P=6.8×10⁻⁶). This putative novel risk variant is not correlated with the index variant
in the CEU or YRI populations (r²<0.01).

11q13 

No positive association was noted with the index variant at 11q13. However, we did detect
evidence of a second independent signal (rs609275: OR=1.20, P=1.0×10⁻⁵), located 74 kb
telomeric of the index variant and 53 kb centromeric of CCND1. The variant is monomorphic and
uncorrelated with the index signal in the CEU population, and r² with the index signal in the
YRI population is <0.01.

16q12 

As in previous studies of African Americans we were not able to replicate the association signal
defined by the index variant rs3803662 [12,13]. A recent study of African Americans reported a
suggestive association with SNP rs3104746, which is located 15 kb telomeric of rs3803662 [14].
This SNP has a minor allele frequency of 0.04 in the HapMap CEU population and 0.19 in our
African American controls, and is modestly correlated with rs3803662 in Africans (r²=0.31 in
YRI), but not in Europeans (r²=0.038). Fine-mapping around this putative signal revealed a
perfect proxy (r²=1) for rs3104746, rs3112572, which is significantly associated with breast
cancer risk in African Americans (OR=1.18, P=3.9×10⁻⁴), with the association noted to be
stronger for ER+ breast cancer (OR=1.27, P=3.1×10⁻⁵).

For index SNPs found to be nominally associated with breast cancer risk, as well as
risk-associated markers identified through fine-mapping, we also tested for associations by
genotype. Results from the genotype-specific models were consistent with log-additive
associations. Risk variants at 2q35 and 8q24 were also found to have significantly stronger
associations with ER+ breast cancer than with ER- disease, which is consistent with previous
studies [8].

We observed no statistically significant associations with common variation at 10 risk loci
on 1p11, 3p24, 5p12, 6q25, 9p21, 10p15, 10q21, 11p15, 14q24 and 17q23.

Risk modeling 

We next estimated the cumulative effect of all breast cancer risk variants, and compared a
summary risk score composed of unweighted counts of all GWAS-reported risk variants to a risk
score that included variants we identified as being associated with risk in African Americans.
Using the 19 index signals from GWAS, the risk per allele was 1.04 (95% CI, 1.02-1.06;
P=6.1×10⁻⁵) and individuals in the top quintile of the risk allele distribution were at 1.4-fold
greater risk (P=7.4×10⁻⁵) of breast cancer compared to those in the lowest quintile. As
expected, the risk score was improved when utilizing the markers that we identified at the
known risk loci as being more relevant to African Americans (8 alleles for overall breast cancer:
2q35, 5q11, 9q31, 10q22, 10q26, 11q13, 16q12 and 19p13; OR=1.18; 95% CI, 1.14-1.22;
P=2.8×10⁻²⁴), with risk for those in the top quintile being 2.2 times that observed in the lowest
quintile (P=3.6×10⁻¹⁷). We observed an increase of 1.9 percentage points in the area under the
curve (AUC) (P=2.6×10⁻⁶). This score was significantly associated with risk of both ER+
(OR=1.20, P=1.7×10⁻¹⁹) and ER- (OR=1.15, P=2.8×10⁻⁹) disease (P_het=0.12).
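The unweighted risk score and the AUC comparison described above can be sketched as follows. This is an illustration under simplifying assumptions, not the analysis pipeline used for this report: genotypes are coded as risk allele counts (0, 1 or 2), the SNP names and toy data are hypothetical, and the AUC is computed directly from the score without covariate adjustment.

```python
from typing import Dict, Sequence


def risk_score(genotype: Dict[str, int], risk_snps: Sequence[str]) -> int:
    """Unweighted score: total number of risk alleles carried (0, 1 or 2 per SNP)."""
    return sum(genotype[snp] for snp in risk_snps)


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy data with hypothetical SNP identifiers.
snps = ["snp_a", "snp_b", "snp_c"]
cases = [risk_score(g, snps) for g in (
    {"snp_a": 2, "snp_b": 1, "snp_c": 0},
    {"snp_a": 2, "snp_b": 1, "snp_c": 1},
    {"snp_a": 2, "snp_b": 2, "snp_c": 1},
)]
controls = [risk_score(g, snps) for g in (
    {"snp_a": 1, "snp_b": 0, "snp_c": 0},
    {"snp_a": 1, "snp_b": 1, "snp_c": 0},
    {"snp_a": 1, "snp_b": 1, "snp_c": 1},
)]
toy_auc = auc(cases, controls)
```

Comparing two scores built from different marker sets on the same subjects, as done in the report, then amounts to comparing their AUC values.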

Stratifying by first-degree family history of breast cancer differentiated risk further, with
those with a family history and in the top quintile of the risk score distribution (4% of the
population) having a 3.4-fold greater risk (P=9.9×10⁻¹⁴) compared to those without a family
history and in the lowest quintile of the risk score.

In hypothesis-generating analyses, we also developed risk scores for ER+ and ER-
breast tumor subtypes utilizing the most informative markers revealed through fine-mapping of
each phenotype. These phenotype-specific scores were highly significant (ER+: OR=1.30,
P=6.0×10⁻¹⁸; ER-: OR=1.20, P=2.3×10⁻¹⁰), with statistically significant heterogeneity noted when
the scores were applied to the other subtype (P_het=1.7×10⁻⁵ and 5.0×10⁻³ for ER+ and ER-
scores, respectively).

Summary 

In this large study of breast cancer in African American women we were able to replicate
associations with 4 of the 19 index variants (at P<0.05). Through fine-mapping, we observed
that overall breast cancer risk was statistically significantly associated with markers in 4
regions which are likely to capture the GWAS-reported signal and to serve as better markers of
the functional allele and risk in African Americans. We also detected putative novel
associations that are independent of the index signals in 3 regions for overall breast cancer
(10q22, 11q13 and 16q12) and in one region for ER+ disease (8q24). In 10 of the risk regions,
however, we were not able to replicate the GWAS index signals, nor did we detect statistically
significant associations of common SNPs with breast cancer risk at the levels of statistical
significance we set for fine-mapping.

In four regions we observed risk markers that are correlated and in the same LD
block with the index markers in CEU (rs13000023 at 2q35, rs16886165 at 5q11, rs2981578 at
10q26 and rs3745185 at 19p13; r²≥0.35). It is likely that these risk markers capture the same
signal as the index markers do; however, we cannot rule out the possibility that some of them
may represent a second, independent signal in the same region. Since the r² between these
markers and the index markers is higher, any second signals they pick up are likely to be in the
same LD block as the index signal in CEU and so would have been captured by the index marker in
the original GWAS, which justifies the more liberal P-value threshold we used to select them.

In the four regions where we observed independent signals, the risk alleles (rs16902056
at 8q24, rs12355688 at 10q22, rs609275 at 11q13 and rs3112572 at 16q12) were uncorrelated
with (r²<0.04) and not in the same LD block as the index variant in Europeans (CEU) [distances
from the index signal ranged from 14 kb at 16q12 to 215 kb at 10q22]. Therefore, these variants
are likely to pick up a novel signal independent of the index signal. At 10q22, both the index
SNP and the novel variant are located within introns of the ZMIZ1 gene. ZMIZ1 encodes zinc
finger MIZ-type containing 1, which regulates the activity of various transcription factors
including the androgen receptor, Smad3/4, and p53. At 11q13, rs609275 lies 74 kb telomeric of
the index signal and in closer proximity to a number of candidate genes including CCND1
(encoding cyclin D1, a protein crucial for cell cycle control), ORAOV1 (encoding oral cancer
overexpressed 1) and FGF19 (encoding fibroblast growth factor 19). The association at 16q12
confirms the findings of a previous, smaller study of African Americans [15], and is consistent
with a previous fine-mapping study suggesting that African Americans may harbor a separate
causal variant in this region [12]. Whether this variant influences the same genes/pathways as
the index variant rs3803662 is not known; however, the stronger associations noted for both
variants with ER+ disease suggest that they may affect the same biological process.

Notably, at region 19p13, which was originally reported in association with ER- breast
cancer [9], the index signal was statistically significantly associated with both ER+ and ER-
subtypes in African Americans. In addition, we found a stronger marker in this region
(rs3745185) for ER+ as well as overall breast cancer risk. We also found stronger associations
with ER+ than ER- disease for variants in many regions, including 2q35, 8q24, 10q26 and
16q12, which is consistent with previous reports [8]. We also found strong signals for ER-
disease in regions 5q11, 10q26 and 19p13. While there have not been any reported associations
between these signals and ER- disease in European-ancestry populations, it is possible that they
explain some of the excess risk for ER- disease in African Americans, since these risk alleles
have higher frequencies in this population than they do in European-ancestry populations.

The majority of the variants identified by GWAS for common cancers are of low risk 
(relative risks <1.30) and in aggregate are not yet informative for risk prediction. Until the 
functional alleles at each susceptibility locus are identified and their effects are accurately 
estimated, modeling of the genetic risk will rely on markers that best capture risk for a given 
population. Many of the markers we identified at these risk loci appear to provide improvement 
over the GWAS-identified variants in defining African American women who are at greater risk 
of breast cancer. The risk score for overall breast cancer was also equally efficient for ER+ and 
ER- tumors. However, our hypothesis-generating model suggests that identification of tumor 
subtype-specific variants will improve the fit of these models. 

While this is the largest study of African Americans to date to investigate genetic risk at
known breast cancer susceptibility loci, statistical power was still limited. We had only 35%
power to detect an OR of 1.10 for a risk allele of 0.10 frequency, which may account for our
inability to replicate GWAS signals or risk-associated markers in 10 of the regions. While
attempting to apply a strict threshold for declaring significance through fine-mapping, we did
not take into account testing for multiple phenotypes (overall breast cancer as well as ER+ and
ER- disease). As a result, the α levels used as selection criteria may be too liberal. However,
our risk modeling focused on the variants revealed for overall breast cancer, whereas we
consider the associations observed for markers identified for ER+ or ER- disease and used in the
subtype-specific risk modeling as hypothesis-generating. Since all of the cases and controls
used for fine-mapping/discovery were also included in the risk modeling, the risk model is
likely to overestimate the level of prediction due to winner’s curse. Instead of partitioning
the sample into test and validation sets, we felt it was necessary to use all of the subjects in
the association testing of known variants and in fine-mapping to increase the statistical power
to detect associations in each region. Therefore, other comparably large studies of African
Americans must be performed in the future to test the model presented.

A manuscript describing these findings is under review at Human Molecular Genetics.
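As a rough check on the power figure quoted in the limitations paragraph above, the power of a per-allele test can be approximated with a normal (Wald) calculation. The sketch below is an approximation under stated assumptions (Hardy-Weinberg equilibrium, a small effect size, a two-sided test at 0.05, and the stage 1 sample sizes); it is not necessarily the method used to produce the reported value, and the function name is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power(odds_ratio: float, raf: float, n_cases: int, n_controls: int,
                 alpha: float = 0.05) -> float:
    """Approximate two-sided power of a per-allele (log-additive) Wald test.

    The standard error of log(OR) is approximated from the genotype variance
    2 * raf * (1 - raf) and the case and control sample sizes.
    """
    se = sqrt((1 / n_cases + 1 / n_controls) / (2 * raf * (1 - raf)))
    z_effect = log(odds_ratio) / se
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Stage 1 sample sizes from this report; OR and allele frequency from the text.
power = approx_power(odds_ratio=1.10, raf=0.10, n_cases=3153, n_controls=2831)
```

With these inputs the approximation gives a power of about 0.35, in line with the figure reported above.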


Meta-Analyses with other Breast Cancer GWAS 

We have also initiated meta-analyses with other GWAS, including a GWAS of triple negative
breast cancer (PI, Fergus Couch) and a GWAS of estrogen receptor negative breast cancer from the
NCI Breast and Prostate Cancer Cohort Consortium (BPC3). From these meta-analyses we
have revealed a novel risk locus for estrogen receptor negative breast cancer.

A Novel Risk Locus for Estrogen Receptor Negative Breast Cancer 

Compared to women of European ancestry, women of African descent are more likely to be
diagnosed with estrogen receptor negative breast cancer [16]. ER negative tumors and triple
negative tumors, the latter of which are deficient in the expression of estrogen, progesterone
(PR) and human epidermal growth factor-2 (HER2) receptors, are observed at even higher rates
among African women in Africa [17], suggesting a genetic component to the high risk of ER
negative phenotypes in women of African descent. Similarly, ER negative breast cancers and
triple negative breast cancers are also the predominant histological subtypes in women with
germline mutations in BRCA1 [18]. The enrichment for ER negative disease in this genetically
predisposed population also suggests the existence of additional genetic factors that
contribute to the risk of ER negative disease. Support for the presence of these factors was
recently provided by a genome-wide association study (GWAS) of breast cancer in BRCA1 mutation
carriers, in which a common risk variant for ER negative breast cancer on chromosome 19p13 was
identified that also displayed significance in ER negative and triple negative disease in the
general population [9].

To search for genetic risk factors for ER negative breast cancer phenotypes, we
combined results from our GWAS of breast cancer in African American women [AABC: 3,016
cases (988 with ER negative disease) and 2,745 controls] with results from a GWAS of triple
negative breast cancer in women of European ancestry (TNBCC: 1,718 cases and 3,670
controls). In TNBCC, cases were genotyped with the Illumina 660W array. Genotypes of
TNBCC cases were compared with GWAS data for publicly available controls. Both studies
imputed genotypes for common SNPs in Phase 2 HapMap populations (release 21). A total of
3,154,485 SNPs, genotyped and imputed, were analyzed in stage 1 of the meta-analysis. In the
combined results, only SNP 241 at chromosome 5 displayed a genome-wide significant
association with ER negative breast cancer (AABC: OR per allele=1.32, p=1.9×10⁻⁶; TNBCC:
OR=1.25, p=1.2×10⁻³; combined OR=1.29, p=1.0×10⁻⁸).
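The way the two study-level results combine can be illustrated with a fixed-effect, inverse-variance weighted meta-analysis. This is a sketch under assumptions: the report does not state which meta-analysis method was used, the standard errors are back-calculated from the rounded odds ratios and P-values (treated here as two-sided Wald tests), and the function names are illustrative.

```python
from math import erfc, exp, log, sqrt
from statistics import NormalDist
from typing import List, Tuple


def se_from_or_and_p(odds_ratio: float, p_value: float) -> float:
    """Back-calculate the standard error of log(OR) from a two-sided Wald P-value."""
    z = -NormalDist().inv_cdf(p_value / 2)
    return abs(log(odds_ratio)) / z


def fixed_effect_meta(studies: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Inverse-variance weighted combination of (log OR, SE) pairs.

    Returns the pooled odds ratio and its two-sided P-value.
    """
    weights = [1 / se ** 2 for _, se in studies]
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, studies)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    z = abs(pooled_beta) / pooled_se
    p_value = erfc(z / sqrt(2))  # two-sided normal tail probability
    return exp(pooled_beta), p_value


# Rounded summary statistics for SNP 241 as quoted in the text.
aabc = (log(1.32), se_from_or_and_p(1.32, 1.9e-6))
tnbcc = (log(1.25), se_from_or_and_p(1.25, 1.2e-3))
pooled_or, pooled_p = fixed_effect_meta([aabc, tnbcc])
```

Run on the rounded inputs, this yields a pooled odds ratio close to 1.29 with a P-value on the order of 10⁻⁸, consistent with the combined result reported above.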

To further confirm the association on chromosome 5, we genotyped SNP 241 in women
of European ancestry, which included 8,313 cases (1,308 ER negatives) and 10,879 controls
from the NCI Breast and Prostate Cancer Cohort Consortium (BPC3) and 6,307 cases (813 ER
negatives) and 6,722 controls from Studies of Epidemiology and Risk Factors in Cancer
Heredity (SEARCH). Evidence for replication was observed for SNP 241 and ER negative breast
cancer in both studies (BPC3: OR=1.10, p=0.053; SEARCH: OR=1.21, p=1.7×10⁻³).

In combining the results across all studies (5,874 ER negative cases and 21,389
controls with genotype data), SNP 241 was significantly associated with an increased risk of ER
negative breast cancer (OR=1.19; 95% CI, 1.13-1.25; p=1.6×10⁻¹⁰). The odds ratios for
heterozygote and homozygote carriers were 1.14 (95% CI, 1.06-1.23) and 1.46 (95% CI, 1.29-1.65),
respectively. We observed little evidence of heterogeneity for the reported association for this
variant by study/country in AABC (p_het=0.97), TNBCC (p_het=0.88) or BPC3 (p_het=0.41).
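The genotype-specific estimates above can be compared with what a log-additive (per-allele) model would imply, under which the homozygote odds ratio is the square of the per-allele odds ratio. The short sketch below makes that comparison using the rounded values quoted in the text; it is an informal consistency check, not a formal test of the genetic model, and the helper names are illustrative.

```python
from typing import Tuple


def log_additive_genotype_ors(per_allele_or: float) -> Tuple[float, float]:
    """Heterozygote and homozygote ORs implied by a multiplicative per-allele effect."""
    return per_allele_or, per_allele_or ** 2


def within(value: float, interval: Tuple[float, float]) -> bool:
    """True when value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


het_expected, hom_expected = log_additive_genotype_ors(1.19)
het_ok = within(het_expected, (1.06, 1.23))  # reported 95% CI for heterozygotes
hom_ok = within(hom_expected, (1.29, 1.65))  # reported 95% CI for homozygotes
```

Both implied values fall inside the reported confidence intervals, so the genotype-specific estimates do not contradict a log-additive effect.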

In an analysis of ER positive cases, SNP 241 was only weakly associated with risk in
African Americans (AABC: 1,518 ER positive cases and 2,743 controls with genotype data:
OR=1.08; p=0.10) and in women of European ancestry (BPC3: 4,671 ER positive cases and
10,397 controls, OR=1.03, p=0.26; SEARCH: 3,359 ER positive cases and 6,722 controls,
OR=1.02, p=0.52; combined for all populations: OR=1.04, p=0.06, p_het=0.61). This result
suggests that the association with breast cancer might be specific for ER negative disease
(P-value for case-only test of ER negative versus ER positive = 4.0×10⁻⁴).

Summary 

Similar to 8q24 [10,19,20] and 11q13 [21-23], the TERT/CLPTM1L locus harbors multiple risk
variants for different cancers (reviewed in [24]). SNP 241 is modestly correlated (r²=0.13-0.43
in 1000 Genomes Project populations of European and African ancestry) with variants found for
serous ovarian cancer (rs7726159), glioma (rs2736100), and lung cancer (rs2736100,
rs2735940) [25-27]. Aside from risk variant rs2853676, found for glioma [27], which was
associated with risk in TNBCC (p=0.014, r²=0.05 with SNP 241), none of the known risk variants
identified for other cancers in the TERT/CLPTM1L region were significantly associated with
breast cancer risk in TNBCC or AABC. The TERT gene encodes the catalytic subunit of telomerase,
which controls telomere length, a process linked with genomic instability and implicated in
tumorigenesis. The TERT locus may highlight another biological process common to the
pathogenesis of ER negative breast cancer and serous ovarian cancer that is also shared with
other cancers.

A manuscript describing these findings is currently under review at Nature Genetics.

Figure 1. Risk allele frequencies in Europeans and African Americans.

The distribution of risk allele frequencies (RAF) for the 19 index SNPs in HapMap CEU (CHB for
rs2046210) and African Americans (AA). The variants are sorted based on the RAF in CEU.

Key Research Accomplishments


1) We conducted detailed fine-mapping of the known breast cancer risk loci and 
have validated and improved upon markers of risk that better characterize their 
contribution to breast cancer risk in women of African ancestry. 

2) We conducted the largest study to directly investigate genetic susceptibility to 
ER- breast cancer, which is an aggressive type of breast cancer that is 
associated with poor prognosis and decreased survival, and for which treatment 
options are limited. 

3) Of the 20 risk loci for breast cancer, only the association at chromosome 19 is
limited to ER- breast cancer. The chromosome 5 locus becomes only the second risk
locus that has been identified for ER- breast cancer.

4) We have limited knowledge of the etiology of ER- breast cancer. The
association of a common variant at chromosome 5 with ER- disease
substantiates epidemiological studies in demonstrating distinct etiologies for
breast cancer subtypes as defined by tumor markers. An understanding of the
biological pathways is the first step towards developing effective strategies for
preventing and treating ER- disease.

5) This is the first GWAS of breast cancer in African American women, a population
in which the incidence of ER- disease is substantially greater than that in other
populations. Our findings suggest that the chromosome 5 locus may
contribute to the greater incidence of ER- breast cancer in this population.

Reportable Outcomes

• The findings presented in this progress report will be discussed at the DOD BCRP 
meeting in Orlando (August 2011). 

• Two papers are under review, at Human Molecular Genetics and Nature Genetics.

Conclusion


Through fine-mapping of the breast cancer susceptibility regions in a large sample of
African American women, we identified markers that improve breast cancer risk
prediction for this population. In aggregate, the informative markers at the established
risk loci allow for an improvement in modeling of breast cancer risk over GWAS-reported
markers in African Americans (per allele OR=1.18, P=2.8×10⁻²⁴ vs. OR=1.04,
P=6.1×10⁻⁵). Validation and enhancement of this model is needed before risk modeling based on
genetic variants of low risk can be implemented in the clinical setting. At chromosome 5,
the identification of the variant directly responsible for the association will be required to
fully address the extent to which this locus contributes to the greater incidence of ER
negative and triple negative tumors in women of African ancestry. However, it is notable
that both the risk allele frequency and the odds ratio for SNP 241 are greater in African
American women (frequency, 0.57; OR=1.32, p=1.9×10⁻⁶) than in women of European
ancestry (frequency, 0.26; OR=1.15, p=2.2×10⁻⁶). Based on these differences in
frequency and effect size, and assuming this variant is an equally good surrogate for the
biologically functional allele in each population, we estimate that this locus may be
responsible for a notable fraction (25%; 95% CI, 7-49%) of the greater incidence rate of
ER negative breast cancer in women of African ancestry than in women of European ancestry.
Larger studies with well-characterized tumor pathology information will be needed to determine
whether the association we observed applies to all ER negative disease or to tumor subtypes that
include ER negative status as a component, such as triple negative breast cancer. Our
findings provide further support for the presence of genetic susceptibility to ER negative
breast cancer and demonstrate the importance of discovery efforts in multiple
populations.

A future direction of our work will be to continue to combine GWAS data from multiple 
populations (as we have done for the chromosome 5 locus). Such work is in progress 
and in preliminary analyses we have found 2 additional novel loci on chromosomes 6 
and 20 which we are currently pursuing in multiethnic replication studies. 

Revealing the genetic causes of breast cancer in each population will in time translate
into more targeted preventive measures and treatment strategies for those at risk of
developing breast cancer. The risk locus on chromosome 5 is the first clue that there
may be a genetic basis for the greater incidence of estrogen receptor negative breast
cancer in women of African ancestry.